Make http://metadata.google.internal configurable #40317

Merged 2 commits on Aug 7, 2020

@shawwn
Contributor

shawwn commented Jun 9, 2020

The TPU client library has a hardcoded dependency on `http://metadata.google.internal`. We need to redirect this URL to a different VM. Since the URL is hardcoded, we're forced either to maintain a fragile code patch against our version of TensorFlow, which isn't ideal, or to rely on `/etc/hosts` to forward `metadata.google.internal`, which causes unexpected global side effects on the user's VM. (For example, GCE uses `metadata.google.internal` to distribute SSH keys to GCE VMs, which breaks when we reroute `metadata.google.internal` using `/etc/hosts`.)

oauth2client solves this by making the metadata host configurable via the `GCE_METADATA_IP` environment variable. The final URL becomes `'http://' + os.getenv('GCE_METADATA_IP', '169.254.169.254')`:

https://github.com/googleapis/oauth2client/blob/50d20532a748f18e53f7d24ccbe6647132c979a9/oauth2client/client.py#L111

Following oauth2client's lead, this PR makes `http://metadata.google.internal` configurable for TensorFlow users via `GCE_METADATA_IP`:

```py
import os

_GCE_METADATA_URL_ENV_VARIABLE = 'GCE_METADATA_IP'

# ...

def _gce_metadata_endpoint():
  # Fall back to the default GCE metadata host when the variable is unset.
  return 'http://' + os.environ.get(
      _GCE_METADATA_URL_ENV_VARIABLE,
      'metadata.google.internal')
```

`GCE_METADATA_IP` might seem like an awkward name. After all, `metadata.google.internal` is a hostname, not an IP address. But it's probably best to match oauth2client's naming convention. That way users won't need to worry about setting two slightly different variable names to configure both oauth2client and TensorFlow.
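
For illustration, here is a minimal sketch of how the override is expected to behave. It is not the code in this PR: the public name `gce_metadata_endpoint`, its `environ` parameter, and the address `10.0.0.5:8080` are hypothetical and exist only so the lookup can be exercised without touching the real process environment.

```py
import os

_GCE_METADATA_URL_ENV_VARIABLE = 'GCE_METADATA_IP'


def gce_metadata_endpoint(environ=None):
  """Returns the metadata base URL, honoring GCE_METADATA_IP if it is set."""
  # `environ` is a hypothetical injection point; it defaults to os.environ.
  environ = os.environ if environ is None else environ
  host = environ.get(_GCE_METADATA_URL_ENV_VARIABLE,
                     'metadata.google.internal')
  return 'http://' + host


# No override: the default metadata hostname is used.
print(gce_metadata_endpoint({}))
# Override: 10.0.0.5:8080 is a made-up address standing in for another VM.
print(gce_metadata_endpoint({'GCE_METADATA_IP': '10.0.0.5:8080'}))
```

Because the value is simply appended to `'http://'`, it can be an IP address, a hostname, or a host:port pair, and nothing outside this process is affected, unlike an `/etc/hosts` entry.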
@google-ml-butler google-ml-butler bot added the size:S CL Change Size: Small label Jun 9, 2020
@gbaned gbaned self-assigned this Jun 9, 2020
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jun 9, 2020
@gbaned gbaned added the awaiting review Pull request awaiting review label Jun 12, 2020
@shawwn
Contributor Author

shawwn commented Jul 20, 2020

Hello! Any word on this?

mihaimaruseac
mihaimaruseac previously approved these changes Jul 29, 2020
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Jul 29, 2020
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jul 29, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jul 29, 2020
@mihaimaruseac
Collaborator

Can you fix the sanity test, please?

FAIL: Found 2 non-allowlisted pylint errors:
tensorflow/python/tpu/client/client.py:73: [C0330(bad-continuation), ] Wrong hanging indentation (add 2 spaces).

tensorflow/python/tpu/client/client.py:74: [C0330(bad-continuation), ] Wrong hanging indentation (add 2 spaces).

@gbaned gbaned removed awaiting review Pull request awaiting review ready to pull PR ready for merge process labels Jul 30, 2020
@gbaned
Contributor

gbaned commented Jul 31, 2020

@shawwn Can you please check @mihaimaruseac's comments and keep us posted? Thanks!

@gbaned gbaned added the stat:awaiting response Status - Awaiting response from author label Jul 31, 2020
@gbaned
Contributor

gbaned commented Aug 5, 2020

@shawwn, any update on this PR, please? Thanks!

PR Queue automation moved this from Approved by Reviewer to Reviewer Requested Changes Aug 5, 2020
@shawwn
Contributor Author

shawwn commented Aug 5, 2020

Sorry for the delay! I've pushed a new commit. Is the pylint error fixed now?

@gbaned gbaned removed the stat:awaiting response Status - Awaiting response from author label Aug 6, 2020
PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Aug 6, 2020
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Aug 6, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Aug 6, 2020
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Aug 6, 2020
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Aug 7, 2020
@kokoro-team kokoro-team removed kokoro:force-run Tests on submitted change labels Aug 7, 2020
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Aug 7, 2020
@tensorflow-copybara tensorflow-copybara merged commit 14e0613 into tensorflow:master Aug 7, 2020
PR Queue automation moved this from Approved by Reviewer to Merged Aug 7, 2020
@shawwn
Contributor Author

shawwn commented Nov 2, 2021

Achievement unlocked: JAX cited and implemented this PR: https://twitter.com/theshawwn/status/1455464111446315009

🎉

Labels
cla: yes ready to pull PR ready for merge process size:S CL Change Size: Small

6 participants